…e OpenAI (#20883)
* feat: add opus 4.5 and 4.6 to use output_format param
* generate poetry lock with 2.3.2 poetry
* restore poetry lock
* e2e tests, key delete, update tpm rpm, and regenerate
* Split e2e ui testing for browser
* new login with sso button in login page
* option to hide usage indicator
* fix(cloudzero): update CBF field mappings per LIT-1907 (#20906)
* fix(cloudzero): update CBF field mappings per LIT-1907
Phase 1 field updates for CloudZero integration:
ADD/UPDATE:
- resource/account: Send concat(api_key_alias, '|', api_key_prefix)
- resource/service: Send model_group instead of service_type
- resource/usage_family: Send provider instead of hardcoded 'llm-usage'
- action/operation: NEW - Send team_id
- resource/id: Send model name instead of CZRN
- resource/tag:organization_alias: Add if exists
- resource/tag:project_alias: Add if exists
- resource/tag:user_alias: Add if exists
REMOVE:
- resource/tag:total_tokens: Removed
- resource/tag:team_id: Removed (team_id now in action/operation)
Fixes LIT-1907
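The field mapping above can be sketched as a small transform function. This is an illustrative sketch only: the function name, input keys, and record layout are assumptions, not the actual API of litellm/integrations/cloudzero/transform.py.

```python
def build_cbf_record(usage: dict) -> dict:
    """Sketch of the LIT-1907 CBF field mapping (names illustrative)."""
    api_key_alias = usage.get("api_key_alias") or ""
    api_key_prefix = usage.get("api_key_prefix") or ""

    record = {
        # resource/account: concat(api_key_alias, '|', api_key_prefix)
        "resource/account": f"{api_key_alias}|{api_key_prefix}",
        # resource/service: model_group instead of service_type
        "resource/service": usage.get("model_group"),
        # resource/usage_family: provider instead of hardcoded 'llm-usage'
        "resource/usage_family": usage.get("provider"),
        # action/operation: NEW - team_id (moved out of resource/tag:team_id)
        "action/operation": usage.get("team_id"),
        # resource/id: model name instead of CZRN
        "resource/id": usage.get("model"),
    }
    # alias tags are added only if present
    for tag in ("organization_alias", "project_alias", "user_alias"):
        if usage.get(tag):
            record[f"resource/tag:{tag}"] = usage[tag]
    return record
```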
* Update litellm/integrations/cloudzero/transform.py
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* fix: define api_key_alias variable, update CBFRecord docstring
- Fix F821 lint error: api_key_alias was used but not defined
- Update CBFRecord docstring to reflect LIT-1907 field mappings
- Remove unused Optional import
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* Add banner notifying of breaking change
* Add semgrep & Fix OOMs (#20912)
* [Feat] Policies - Allow connecting Policies to Tags, Simulating Policies, Viewing how many keys, teams it applies on (#20904)
* init schema with TAGS
* ui: add policy test
* resolvePoliciesCall
* add_policy_sources_to_metadata + headers
* types Policy
* preview Impact
* def _describe_match_reason(
* match based on TAGs
* TestTagBasedAttachments
* test fixes
* add policy_resolve_router
* add_guardrails_from_policy_engine
* TestMatchAttribution
* refactor
* fix
* fix: address Greptile review feedback on policy resolve endpoints
- Track unnamed keys/teams as separate counts instead of inflating
affected_keys_count with duplicate "(unnamed key)" placeholders.
Added unnamed_keys_count and unnamed_teams_count to response.
- Push alias pattern matching to DB via _build_alias_where() which
converts exact patterns to Prisma "in" and suffix wildcards to
"startsWith" filters.
- Gate sync_policies_from_db/sync_attachments_from_db behind
force_sync query param (default false) to avoid 2 DB round-trips
on every /policies/resolve request.
- Remove worktree-only conftest.py that cleared sys.modules at import
time — no longer needed since code moved to main repo.
- Rename MAX_ESTIMATE_IMPACT_ROWS → MAX_POLICY_ESTIMATE_IMPACT_ROWS.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: eliminate duplicate DB queries and fix header delimiter ambiguity
- Fetch teams table once in estimate_attachment_impact and reuse for
both tag-based and alias-based lookups (was querying teams twice when
both tag_patterns and team_patterns were provided).
- Convert tag/team filter functions from async DB queries to sync
filters that operate on pre-fetched data (_filter_keys_by_tags,
_filter_teams_by_tags).
- Fix comma ambiguity in x-litellm-policy-sources header: use '; '
as entry delimiter since matched_via values can contain commas.
- Use '+' as the within-value separator in matched_via reason strings
(e.g. "tag:healthcare+team:health-team") to avoid conflict with
header delimiters.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
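The delimiter scheme described above can be sketched like this. The entry layout (`policy:matched_via`) is an assumption for illustration; only the `'; '` entry delimiter and `'+'` within-value separator come from the change description.

```python
def build_policy_sources_header(sources: list) -> str:
    """Sketch of the x-litellm-policy-sources encoding:
    '; ' separates entries (matched_via values can contain commas),
    '+' joins reasons within a single matched_via value."""
    entries = []
    for policy_name, reasons in sources:
        matched_via = "+".join(reasons)  # e.g. "tag:healthcare+team:health-team"
        entries.append(f"{policy_name}:{matched_via}")
    return "; ".join(entries)
```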
* Update litellm/proxy/policy_engine/policy_resolve_endpoints.py
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* fix: type error & better error handling (#20689)
* [Docs] Add docs guide for using policies (#20914)
* docs v1 guide with UI imgs
* docs fix
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add dashscope/qwen3-max model with tiered pricing (#20919)
Add support for Alibaba Cloud's Qwen3-Max model with:
- 258K input tokens, 65K output tokens
- Tiered pricing based on context window usage (0-32K, 32K-128K, 128K-252K)
- Function calling and tool choice support
- Reasoning capabilities enabled
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
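Tiered context-window pricing as described above can be sketched as a lookup table. The tier boundaries (0-32K, 32K-128K, 128K-252K) come from the entry; the per-token rates below are placeholders, not Qwen3-Max's actual prices, and whether all prompt tokens are charged at the landed tier's rate is an assumption.

```python
# Illustrative tier table: (max_prompt_tokens_inclusive, cost_per_input_token).
# Rates are hypothetical placeholders.
QWEN3_MAX_TIERS = [
    (32_000, 1.0e-6),
    (128_000, 2.0e-6),
    (252_000, 3.0e-6),
]

def input_cost(prompt_tokens: int) -> float:
    """Find the tier the prompt falls into and charge all
    prompt tokens at that tier's rate."""
    for limit, rate in QWEN3_MAX_TIERS:
        if prompt_tokens <= limit:
            return prompt_tokens * rate
    raise ValueError("prompt exceeds the 252K-token pricing window")
```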
* fix linting
* docs: add Greptile review requirement to PR template (#20762)
* fix(azure): preserve content_policy_violation error details from Azure OpenAI
Closes #20811
Azure OpenAI returns rich error payloads for content policy violations
(inner_error with ResponsibleAIPolicyViolation, content_filter_results,
revised_prompt). Previously these details were lost when:
1. The top-level error code was not "content_policy_violation" but the
inner_error.code was "ResponsibleAIPolicyViolation" -- the structured
check only examined the top-level code.
2. The DALL-E image generation polling path stringified the error JSON
into the message field instead of setting the structured body, making
it impossible for exception_type() to extract error details.
3. The string-based fallback detector used "invalid_request_error" as a
content-policy indicator, which is too broad and could misclassify
regular bad-request errors.
Changes:
- exception_mapping_utils.py: Check inner_error.code for
ResponsibleAIPolicyViolation when top-level code is not
content_policy_violation. Replace overly broad "invalid_request_error"
string match with specific Azure safety-system messages.
- azure.py: Set structured body on AzureOpenAIError in both async and
sync DALL-E polling paths so exception_type() can inspect error details.
- test_azure_exception_mapping.py: Add regression tests covering the
exact error payloads from issue #20811.
- Fix pre-existing lint: duplicate PerplexityResponsesConfig dict key,
unused RouteChecks top-level import.
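The structured check described above can be sketched as follows. This is an illustration of the logic, not the exception_mapping_utils.py code; the exact payload shape is assumed from the description (Azure emits both `inner_error` and `innererror` spellings in different APIs, so the sketch checks both).

```python
def is_content_policy_violation(error_body: dict) -> bool:
    """Sketch: treat an error as a content-policy violation when either
    the top-level code says so, or the inner error code is
    ResponsibleAIPolicyViolation (previously only the top-level
    code was examined, losing the Azure detail)."""
    err = error_body.get("error") or {}
    if err.get("code") == "content_policy_violation":
        return True
    inner = err.get("inner_error") or err.get("innererror") or {}
    return inner.get("code") == "ResponsibleAIPolicyViolation"
```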
---------
Co-authored-by: Kelvin Tran <kelvin-tran@users.noreply.github.com>
Co-authored-by: yuneng-jiang <yuneng.jiang@gmail.com>
Co-authored-by: shin-bot-litellm <shin-bot-litellm@berri.ai>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: Alexsander Hamir <alexsanderhamirgomesbaptista@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Harshit Jain <48647625+Harshit28j@users.noreply.github.com>
Co-authored-by: ken <122603020@qq.com>
Co-authored-by: Sameer Kankute <sameer@berri.ai>
Relevant issues
Presidio API returned a plain string (e.g. an error message) instead of a list of dicts. That often occurred with web search or hosted models. The code treated analyze_results as a list and iterated over it; for a string, it iterated over characters and tried PresidioAnalyzeResponseItem(**"e"), which triggered the error.
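The defensive fix can be sketched like this. The function name and error handling are illustrative, not the actual guardrail code; only the failure mode (a plain string iterated character by character) comes from the description above.

```python
def parse_analyze_results(analyze_results):
    """Sketch: Presidio may return a plain string (e.g. an error
    message) instead of a list of dicts. Iterating a string yields
    single characters, so check the type before parsing items."""
    if isinstance(analyze_results, str):
        # Surface the upstream error instead of iterating characters
        raise ValueError(f"Presidio returned an error string: {analyze_results}")
    if not isinstance(analyze_results, list):
        raise TypeError(f"Unexpected Presidio response type: {type(analyze_results)}")
    return [item for item in analyze_results if isinstance(item, dict)]
```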
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
- I have added testing in the `tests/litellm/` directory (adding at least 1 test is a hard requirement - see details)
- My PR passes all unit tests on `make test-unit`
- CI (LiteLLM team)
Type
🐛 Bug Fix
✅ Test
Changes
Code changes
ref docs on this:
- https://microsoft.github.io/presidio/api/analyzer_python/
- https://microsoft.github.io/presidio/api-docs/api-docs.html